25 September, 2016
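Why do we need visualization?
John Snow mapped the cholera cases in Soho, London, and found that the homes of the infected clustered around one particular well, which finally let him pin down that well as the source of contamination. A week later the city authorities closed off the well, and it was later shown that cholera spreads through contaminated water rather than through the air, as was commonly believed.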
"The simple graph has brought more information to the data analyst's mind than any other device." – John Tukey
In data visualization, the key lies in "
A classic dataset: Anscombe's Quartet
#>    vars  n mean   sd median trimmed  mad  min   max range  skew kurtosis   se
#> x1    1 11  9.0 3.32   9.00    9.00 4.45 4.00 14.00 10.00  0.00    -1.53 1.00
#> x2    2 11  9.0 3.32   9.00    9.00 4.45 4.00 14.00 10.00  0.00    -1.53 1.00
#> x3    3 11  9.0 3.32   9.00    9.00 4.45 4.00 14.00 10.00  0.00    -1.53 1.00
#> x4    4 11  9.0 3.32   8.00    8.00 0.00 8.00 19.00 11.00  2.47     4.52 1.00
#> y1    5 11  7.5 2.03   7.58    7.49 1.82 4.26 10.84  6.58 -0.05    -1.20 0.61
#> y2    6 11  7.5 2.03   8.14    7.79 1.47 3.10  9.26  6.16 -0.98    -0.51 0.61
#> y3    7 11  7.5 2.03   7.11    7.15 1.53 5.39 12.74  7.35  1.38     1.24 0.61
#> y4    8 11  7.5 2.03   7.04    7.20 1.90 5.25 12.50  7.25  1.12     0.63 0.61
| x1 | x2 | x3 | x4 | y1 | y2 | y3 | y4 |
|---|---|---|---|---|---|---|---|
| 10 | 10 | 10 | 8 | 8.04 | 9.14 | 7.46 | 6.58 |
| 8 | 8 | 8 | 8 | 6.95 | 8.14 | 6.77 | 5.76 |
| 13 | 13 | 13 | 8 | 7.58 | 8.74 | 12.74 | 7.71 |
| 9 | 9 | 9 | 8 | 8.81 | 8.77 | 7.11 | 8.84 |
| 11 | 11 | 11 | 8 | 8.33 | 9.26 | 7.81 | 8.47 |
| 14 | 14 | 14 | 8 | 9.96 | 8.10 | 8.84 | 7.04 |
| 6 | 6 | 6 | 8 | 7.24 | 6.13 | 6.08 | 5.25 |
| 4 | 4 | 4 | 19 | 4.26 | 3.10 | 5.39 | 12.50 |
| 12 | 12 | 12 | 8 | 10.84 | 9.13 | 8.15 | 5.56 |
| 7 | 7 | 7 | 8 | 4.82 | 7.26 | 6.42 | 7.91 |
| 5 | 5 | 5 | 8 | 5.68 | 4.74 | 5.73 | 6.89 |
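The point of the quartet is easier to see when the four sets are summarized and plotted side by side. The following is a minimal sketch, assuming the `anscombe` data frame that ships with base R and the ggplot2 package; the helper names `summary_stats`, `long` and `p_anscombe` are made up for this illustration.

```r
library(ggplot2)

# `anscombe` is a built-in data frame with columns x1..x4 and y1..y4.
# Per-set summary statistics: the four sets look nearly identical here.
summary_stats <- data.frame(
  set    = 1:4,
  mean_x = sapply(1:4, function(i) mean(anscombe[[paste0("x", i)]])),
  mean_y = sapply(1:4, function(i) mean(anscombe[[paste0("y", i)]])),
  sd_x   = sapply(1:4, function(i) sd(anscombe[[paste0("x", i)]])),
  sd_y   = sapply(1:4, function(i) sd(anscombe[[paste0("y", i)]])),
  cor_xy = sapply(1:4, function(i) {
    cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
  })
)
print(round(summary_stats, 2))

# Stack the four (x, y) pairs into one long data frame for faceting.
long <- do.call(rbind, lapply(1:4, function(i) {
  data.frame(
    set = paste("Set", i),
    x   = anscombe[[paste0("x", i)]],
    y   = anscombe[[paste0("y", i)]]
  )
}))

# One panel per set, each with the same kind of linear fit.
p_anscombe <- ggplot(long, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ set)
print(p_anscombe)
```

The summary table is almost the same for every set, while the four panels look nothing alike, which is the argument for plotting the data before trusting the numbers.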
is because we want to learn every band of the spectrum, when in fact it is enough to go by
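Which tool should we choose for plotting?
We hope that by the end of the class, everyone can take this skill home with them.
ggplot2 is a powerful tool for data exploration and visualization, developed by Hadley Wickham, the author of many of the most influential R packages.
The Grammar of Graphics helps us break a chart down into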
It is far better to learn a language by actually speaking it!
mpg dataset:
Fuel economy data from 1999 and 2008 for 38 popular models of car.
| variable | detail |
|---|---|
| manufacturer | manufacturer |
| model | model name |
| displ | engine displacement |
| year | year of manufacture |
| cyl | number of cylinders |
| trans | transmission (automatic / manual) |
| drv | f = front-wheel drive, r = rear-wheel drive, 4 = 4wd |
| cty | city miles per gallon (fuel economy in city driving) |
| hwy | highway miles per gallon (fuel economy in highway driving) |
| fl | fuel type: ethanol E85, diesel, regular, premium, CNG |
| class | type of car |
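Before plotting, it can help to look at the data itself. A minimal sketch, assuming only that ggplot2 is installed (the `mpg` data frame ships with it):

```r
library(ggplot2)  # provides the mpg data frame

dim(mpg)                    # number of rows and columns
head(mpg)                   # first few records
table(mpg$year)             # how many cars per model year
length(unique(mpg$model))   # number of distinct models
table(mpg$class)            # how many cars per type
```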
Start with two variables:
library(ggplot2)
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
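A few conclusions can be drawn from the plot:
Draw different plots from the mpg data to see how different variables relate to each other: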
- hwy vs cyl
- class vs drv
## hwy vs cyl
ggplot(data = mpg) +
  geom_point(mapping = aes(x = hwy, y = cyl))
## class vs drv
ggplot(data = mpg) +
  geom_point(mapping = aes(x = class, y = drv))
Add a third aesthetic to the two-dimensional x-y scatterplot
?geom_point: look up the supported aesthetics
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
Try adding a third aesthetic to the two-dimensional x-y scatterplot
?geom_point: look up the supported aesthetics
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
Try adding a third aesthetic to the two-dimensional x-y scatterplot
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class))
#> Warning: The shape palette can deal with a maximum of 6 discrete values
#> because more than 6 becomes difficult to discriminate; you have 7.
#> Consider specifying shapes manually if you must have them.
#> Warning: Removed 62 rows containing missing values (geom_point).
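The warning suggests specifying shapes manually. One way to do that is `scale_shape_manual()`. This is a sketch only: the shape codes `0:6` are an arbitrary choice for illustration, and any seven distinct codes would serve.

```r
library(ggplot2)

# mpg$class has 7 levels, one more than the default shape palette covers,
# which is why one class was dropped from the plot.
n_classes <- length(unique(mpg$class))

p_shapes <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  # Assumption for this sketch: shape codes 0 to 6, one per class.
  scale_shape_manual(values = 0:6)
print(p_shapes)
```

With one shape per class, every row gets a shape and no points are removed. Seven shapes are still hard to tell apart, so color may remain the better choice for this variable.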
ggplot(data = <DATA>) +               # Data
  geom_<xxx>(
    mapping = aes(<MAPPINGS>),        # <= Aesthetic mappings
    stat = <STAT>,
    position = <POSITION>
  ) +
  scale_<xxx>() +
  coord_<xxx>() +
  facet_<xxx>() +
  theme_<xxx>()
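To make the template concrete, here is one possible way to fill in every slot with the mpg data. The particular choices (jittered points, the "Set1" palette, the y-axis limits, faceting by year, `theme_minimal()`) are assumptions picked for illustration, not recommendations from the slides.

```r
library(ggplot2)

p_full <- ggplot(data = mpg) +                        # <DATA>
  geom_point(
    mapping  = aes(x = displ, y = hwy, color = drv),  # <MAPPINGS>
    stat     = "identity",                            # <STAT>
    position = "jitter"                               # <POSITION>
  ) +
  scale_color_brewer(palette = "Set1") +              # scale_<xxx>()
  coord_cartesian(ylim = c(10, 45)) +                 # coord_<xxx>()
  facet_wrap(~ year) +                                # facet_<xxx>()
  theme_minimal()                                     # theme_<xxx>()
print(p_full)
```

Most plots need only the first two pieces; the remaining components have defaults and are added only when the default is not what you want.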
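aes() can be placed:
- inside ggplot(): has a "memory effect"
- added with + aes(): has a "memory effect"
- inside geom_<xxx>(): no "memory effect" (applies to that geom only)
- geom_<xxx>(inherit.aes = FALSE): overrides the default aesthetics.
Sometimes you may just want to manually set a particular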
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
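The two ideas above, where aes() is placed and setting a value by hand versus mapping it, can be compared directly. This is a hedged sketch; the object names (`p_global`, `p_local`, `p_set`, `p_mapped`) and the choice of a linear `geom_smooth()` as the second layer are assumptions made for the example.

```r
library(ggplot2)

# aes() inside ggplot(): inherited by every layer added afterwards.
p_global <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# aes() inside one geom: applies to that geom only, so the second
# layer has to repeat the mapping to know what x and y are.
p_local <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(x = displ, y = hwy),
              method = "lm", formula = y ~ x, se = FALSE)

# Setting: color is outside aes(), so every point is drawn in blue.
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# Mapping: color is inside aes(), so "blue" is treated as a data value
# and a scale picks the actual color (and adds a legend).
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

print(p_set)
print(p_mapped)
```

The last pair is a common stumbling block: the points in `p_mapped` are not blue, because the string was mapped through the color scale instead of being used as a literal color.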
How to look things up?
?geom_<xxx>: each geom supports a different set of aesthetics
Bar Chart may be the most useful type!!